A new search metric is discussed that may be used to better assess the
predictive capability of different math term combinations during the optimization
of a regression model of experimental data. The new search metric can be
determined for each tested math term combination if the given experimental
data set is split into two subsets. The first subset consists of data points that
are only used to determine the coefficients of the regression model. The second
subset consists of confirmation points that are exclusively used to test the
regression model. The new search metric value is assigned after comparing two
values that describe the quality of the fit of each subset. The first value is the
standard deviation of the PRESS residuals of the data points. The second value
is the standard deviation of the response residuals of the confirmation points.
The greater of the two values is used as the new search metric value. This choice
guarantees that both standard deviations are always less than or equal to the value
that is used during the optimization. Experimental data from the calibration
of a wind tunnel strain-gage balance is used to illustrate the application of the
new search metric. The new search metric ultimately generates an optimized
regression model that was already tested at confirmation points that are
independent of the regression model before it is ever used to predict an unknown
response from a set of regressors.

$a_1, a_2, \ldots$ = regression coefficients
$d$ = response residual of a data point
$d'$ = PRESS residual of a data point
$i$ = data point index
$j$ = confirmation point index
$NF$ = normal force of a wind tunnel balance, [lbs]
$p$ = number of data points
$q$ = number of confirmation points
$R1, R2, \ldots, R6$ = electrical outputs of the strain-gages of a wind tunnel balance, [microV/V]
$\delta$ = response residual of a confirmation point
$\sigma$ = standard deviation
$\sigma_\alpha$ = standard deviation of the PRESS residuals of the data points
$\sigma_\beta$ = standard deviation of the response residuals of the confirmation points


I. Introduction 


During the past 5 years a candidate math model search algorithm was developed for NASA Ames’ 
Wind Tunnel Division that optimizes regression models of multivariate experimental data (see Ref. [1] and 
[2] for more detail). The goal of the optimization is the automated selection of a regression model, i.e., the 
recommended math model, for the given data set that (i) meets strict statistical quality requirements and
(ii) prevents “overfitting” of the data set’s responses. These characteristics of the algorithm make it possible
for an inexperienced user to apply, during a regression analysis, advanced statistical metrics that the
user may not be familiar with and that in the past were only available to a highly skilled analyst.

The algorithm was originally developed for the more efficient analysis of wind tunnel strain-gage balance 
calibration data (see Ref. [3], [4], and [5]). It is, however, also applicable to general global regression analysis 
problems. The algorithm was implemented in a regression analysis software package called BALFIT that 
is used on a regular basis at Ames Research Center for the analysis of wind tunnel strain-gage balance 
calibration data and other experimental data sets. 

Figure 1 depicts key elements of the candidate math model search algorithm that BALFIT uses for 
the regression model optimization and the identification of the recommended math model. The algorithm 
is discussed in great detail in Ref. [1]. In principle, it minimizes a search metric that is a measure of the 
predictive capability of different regression model term combinations that are compared during the search 
(Ref. [1]). In addition, a primary and a secondary search constraint are enforced during the optimization 
that make it possible to examine only those regression models that (i) have statistically significant terms and 
(ii) do not contain near-linear dependencies between terms. The author’s experience has shown that math 
term “hierarchy” is not a necessary condition for a good regression model of experimental data (see also the 
author’s detailed discussion of the hierarchy rule in Ref. [1], pp.6-8). Therefore, the candidate math model 
search algorithm allows the user to enforce math term hierarchy only as an optional constraint that may be 
applied after the completion of the regression model search. 

A significant improvement of the candidate math model search algorithm was made in 2009: A new 
search metric was selected for the algorithm that uses regression model independent confirmation points for 
the first time. Therefore, the new search metric is a better indicator of the predictive capability of math term 
combinations that are tested during the regression model search. The new search metric will be discussed in 
great detail in the next sections of the paper. First, however, two previously used search metrics are reviewed 
in order to illustrate the problem that the new search metric is trying to solve. Then, basic elements of the 
new search metric are discussed. Finally, data from the calibration of a wind tunnel strain-gage balance is 
used to illustrate the application of the new search metric to a realistic experimental data set. 

II. Previously Used Search Metrics 

In the past, two alternate search metrics were used in the candidate math model search algorithm in 
order to identify the recommended math model. They are reviewed in some detail in this section. This 
review will provide a better understanding of the advantages that the recently implemented new search 
metric offers. 

Both previously used search metrics only analyze the subset of an experimental data set that is used 
to obtain the regression coefficients. Points belonging to this subset may be called “data points.” They are 
defined as follows: 


DEFINITION 1: DATA POINTS 

THE SUBSET OF A GIVEN EXPERIMENTAL DATA SET THAT IS EXCLUSIVELY 
USED TO DETERMINE THE COEFFICIENTS OF THE REGRESSION MODEL. 


The first search metric, i.e., Search Metric 1, was introduced in the initial version of the candidate math
model search algorithm in 2004 (see Ref. [3]). The metric equals the standard deviation of the response 
residuals of the data points. It can be written as follows: 


$$
\text{Search Metric 1} \;=\; \sigma(\underbrace{d_1, d_2, d_3, \ldots, d_i, \ldots, d_{p-2}, d_{p-1}, d_p}_{\text{response residuals of data points}}) \tag{1}
$$

where $d_i$ equals the response residual (i.e., the difference between the fitted and measured response) of a
data point, $i$ is the data point index, and $p$ is the total number of data points. Search Metric 1 has a major
disadvantage: it only assesses the predictive capability of a tested regression model at points that are used to
determine the regression model. In other words, the regression model is not tested at confirmation points,
i.e., points that are independent of the regression model coefficients.

A second search metric, i.e., Search Metric 2, was introduced in 2007 to address the shortcoming
of Search Metric 1 (see Ref. [2]). At that time the author realized that the standard deviation of the
PRESS residuals of the data points would be a better search metric because the PRESS residual of a data point
is computed by using the data point as a confirmation point. In addition, PRESS residuals are explicitly
recommended in the literature for the comparison of the predictive capability of different regression models
(see Ref. [6], p. 142). Search Metric 2 can be summarized as follows:


$$
\text{Search Metric 2} \;=\; \sigma(\underbrace{d'_1, d'_2, d'_3, \ldots, d'_i, \ldots, d'_{p-2}, d'_{p-1}, d'_p}_{\text{PRESS residuals of data points}}) \tag{2}
$$


where $d'_i$ equals the PRESS residual of a data point, $i$ is the data point index, and $p$ is the total number
of data points. The calculation of the PRESS residual of a data point requires several steps. The PRESS 
residual is evaluated by removing the data point from the set of data points, fitting a regression model to the 
remaining data points, and testing the regression model at the omitted data point. Therefore, the standard 
deviation of the PRESS residuals is a metric that uses each data point as a confirmation point, i.e., as a 
point that is not used to generate the regression model when its predictive capability is tested. 
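
The leave-one-out procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BALFIT implementation: NumPy, the function names, and the design matrix `X` (one row per data point, one column per math term) are all introduced here for illustration. Residuals follow the sign convention used in the text (fitted minus measured). The first function refits the model once per omitted point. The second uses the standard least-squares identity that a PRESS residual equals the response residual divided by one minus the leverage of the point, which gives the same values from a single fit when `X` has full column rank.

```python
import numpy as np

def press_residuals_refit(X, y):
    """PRESS residuals by explicit leave-one-out refitting."""
    p = len(y)
    press = np.empty(p)
    for i in range(p):
        keep = np.arange(p) != i                      # omit data point i
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        press[i] = X[i] @ coef - y[i]                 # test the fit at the omitted point
    return press

def press_residuals_leverage(X, y):
    """Same PRESS residuals from one fit, using d'_i = d_i / (1 - h_ii)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = X @ coef - y                          # response residuals d_i
    Q, _ = np.linalg.qr(X)                            # reduced QR factorization
    leverage = np.sum(Q**2, axis=1)                   # diagonal of the hat matrix
    return residuals / (1.0 - leverage)

def search_metric_2(X, y):
    """Standard deviation of the PRESS residuals (sample form, ddof=1, is an assumption)."""
    return float(np.std(press_residuals_leverage(X, y), ddof=1))
```

The refitting version mirrors the verbal description step by step; the leverage version is what one would normally use inside a search loop, because it avoids one refit per data point for every tested math term combination.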

Search Metric 2, however, also has a disadvantage. All data points must be used to develop the final 
regression model of the data set after the completion of the candidate math model search. No data point 
can be omitted. Therefore, Search Metric 2 is really a compromise. Initially, data points are used as 
confirmation points in order to test the regression model. Afterwards, however, all data points are needed 
in order to compute the final values of the regression model coefficients. The author also observed that the
difference between the standard deviation of the PRESS residuals of the data points (Search Metric 2) and
the standard deviation of the response residuals of the data points (Search Metric 1) becomes
smaller as the number of data points is increased. In other words, Search Metric 2 gradually loses its
diagnostic advantage as the number of data points increases. An alternate search metric had to be found
that would process independent sets of data points and confirmation points. Ultimately, this conclusion led
to the development of Search Metric 3, which is discussed in the next section of the paper.
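
One way to see why the two metrics converge is a standard least-squares identity, offered here as supporting background rather than as part of the author's derivation. The symbols $k$ (number of math terms in the regression model) and $h_{ii}$ (leverage of data point $i$, i.e., the $i$-th diagonal element of the hat matrix) are introduced for this illustration only.

```latex
d'_i \;=\; \frac{d_i}{1 - h_{ii}},
\qquad
\sum_{i=1}^{p} h_{ii} \;=\; k
\quad\Longrightarrow\quad
\bar{h} \;=\; \frac{k}{p}
```

For a fixed number of math terms the average leverage $k/p$ shrinks as the number of data points $p$ grows, so for a typical point the factor $1/(1 - h_{ii})$ approaches one and the PRESS residual approaches the response residual. As a rough illustration, a 24-term model fitted to 1906 data points has an average leverage of about 0.013, which inflates a typical residual by only about 1%.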

III. New Search Metric 

In 2009 the author developed, implemented, and extensively tested a new search metric, i.e., Search 
Metric 3, in order to address the shortcomings of Search Metric 1 and Search Metric 2 that were discussed 
above. The new search metric was developed after concluding that the data points required to determine the 
regression model coefficients are as important as the confirmation points used to test the regression model. 

It is necessary to define the term “confirmation point” more precisely before Search Metric 3 can be
explained in detail. Often, only data points of an experimental data set exist and no confirmation points are
available. Then, two options exist for obtaining the confirmation points that Search Metric 3 requires.
Option 1: Additional confirmation points are acquired that do not match any of the data points.
Option 2: The original experimental data set is split into two subsets; the first subset contains the data
points that are used to determine the regression coefficients; the second subset contains the confirmation
points that are used to test the regression model. In many situations it is difficult to apply Option 1. Then,
Option 2 is the only way to obtain a suitable set of confirmation points. These confirmation points may be
defined as follows:

DEFINITION 2: CONFIRMATION POINTS

THE SUBSET OF A GIVEN EXPERIMENTAL DATA SET THAT IS EXCLUSIVELY USED TO TEST
THE PREDICTIVE CAPABILITY OF A REGRESSION MODEL. THE COEFFICIENTS OF THE
REGRESSION MODEL ARE INDEPENDENT OF THE CHOSEN CONFIRMATION POINT SET.


The basic idea behind Search Metric 3 can be understood if an “ideal” and an “overfitted” regression
analysis result are compared. Let us assume, for example, that the “dependent variable,” i.e., the response,
of some experimental data is described using a polynomial of a single “independent variable.” Then, an
“ideal” least squares fit result may look like the one depicted in Fig. 2. The response residuals of the 
data points should have a magnitude that is very similar to the magnitude of the response residuals of 
the confirmation points. This definition of an “ideal” regression analysis result can be expressed using the 
standard deviations of the corresponding residuals. We can write 


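DEFINITION 3: “IDEAL” REGRESSION ANALYSIS RESULT

$$
\sigma(\underbrace{d_1, d_2, d_3, \ldots, d_i, \ldots, d_{p-2}, d_{p-1}, d_p}_{\text{response residuals of data points}})
\;\approx\;
\sigma(\underbrace{\delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_j, \ldots, \delta_{q-2}, \delta_{q-1}, \delta_q}_{\text{response residuals of confirmation points}}) \tag{3a}
$$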



where $d_i$ is a response residual of a data point with index $i$ and $\delta_j$ is a response residual of a confirmation
point with index $j$. Figure 3 shows an “overfitted” regression analysis result of the same experimental data
set. Now, the response residuals of the data points have a magnitude that is significantly smaller than the 
magnitude of the response residuals of the confirmation points. This definition of an “overfitted” result can 
be expressed using the standard deviations of the corresponding residuals. Then, we can write: 


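DEFINITION 4: “OVERFITTED” REGRESSION ANALYSIS RESULT

$$
\sigma(\underbrace{d_1, d_2, d_3, \ldots, d_i, \ldots, d_{p-2}, d_{p-1}, d_p}_{\text{response residuals of data points}})
\;\ll\;
\sigma(\underbrace{\delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_j, \ldots, \delta_{q-2}, \delta_{q-1}, \delta_q}_{\text{response residuals of confirmation points}}) \tag{3b}
$$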


The standard deviation of the response residuals of the data points is often very close to the standard
deviation of the PRESS residuals of the data points. In addition, both values depend on the data points.
Therefore, the standard deviation of the response residuals of the data points may be replaced by the
standard deviation of the PRESS residuals of the data points. We get the relationship:

$$
\sigma(\underbrace{d_1, d_2, d_3, \ldots, d_i, \ldots, d_{p-2}, d_{p-1}, d_p}_{\text{response residuals of data points}})
\;\approx\;
\sigma(\underbrace{d'_1, d'_2, d'_3, \ldots, d'_i, \ldots, d'_{p-2}, d'_{p-1}, d'_p}_{\text{PRESS residuals of data points}}) \tag{4}
$$

The suggested substitution will also make it possible to define Search Metric 3 such that it agrees with 
Search Metric 2 if no confirmation points are available for analysis. 

At this point a question emerges: How can the results presented in Fig. 2 and Fig. 3 be used to define 
a new search metric for the candidate math model search? It is obvious, after comparing the two figures, 
that “overfitting” can be avoided if (i) the standard deviation of the PRESS residuals of the data points and 
(ii) the standard deviation of the response residuals of the confirmation points are of similar magnitude. The 
first value, i.e., the standard deviation of the PRESS residuals of the data points, assesses the quality of the 
tested regression model at the data points. The second value, i.e., the standard deviation of the response 
residuals of the confirmation points, assesses the quality of the tested regression model at the confirmation
points. The “ideal” regression model of the data points should favor neither the quality of the fit at the data
points nor the quality of the fit at the confirmation points. This behavior can be guaranteed if the maximum
of the two values is selected as the search metric, because both standard deviations are then always less than
or equal to the value that is minimized. Then, the new metric can be defined as follows:


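$$
\text{Search Metric 3} \;=\; \text{MAX}\{\sigma_\alpha, \sigma_\beta\} \tag{5a}
$$

where

$$
\sigma_\alpha \;=\; \sigma(\underbrace{d'_1, d'_2, d'_3, \ldots, d'_i, \ldots, d'_{p-2}, d'_{p-1}, d'_p}_{\text{PRESS residuals of data points}}) \tag{5b}
$$

$$
\sigma_\beta \;=\; \sigma(\underbrace{\delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_j, \ldots, \delta_{q-2}, \delta_{q-1}, \delta_q}_{\text{response residuals of confirmation points}}) \tag{5c}
$$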


where $\sigma_\alpha$ equals the standard deviation of the PRESS residuals of the data points and $\sigma_\beta$ equals the standard
deviation of the response residuals of the confirmation points. Figure 4 summarizes the differences between
Search Metric 1, Search Metric 2, and Search Metric 3. In addition, Fig. 5 lists the individual steps that are
needed in order to determine Search Metric 3.

Search Metric 3, similar to Search Metric 1 and Search Metric 2, uses the standard deviation of a 
given set of residuals in order to define a search metric for the candidate math model search. Therefore, a 
minimum number of data points and confirmation points is required in order to compute the search metric. 
The number of data points is usually sufficiently large (e.g., > 20) so that the standard deviation of the 
PRESS residuals of the data points can be determined. The number of available confirmation points, on the 
other hand, may be limited. Therefore, Search Metric 3 was implemented in the BALFIT software package 
such that it is only used if the number of confirmation points is at least 20. This threshold choice follows 
recommendations that are made in the literature regarding the minimum number of confirmation points (see 
discussion in Ref. [6], p.308). 
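
The definition of Search Metric 3 and the confirmation point threshold can be combined in a short Python sketch. It is an illustration under assumptions, not the BALFIT code: NumPy, the function name, the argument layout (design matrices with one column per math term), and the use of the sample standard deviation (`ddof=1`) are choices made here. The fallback to the PRESS-based value when fewer than 20 confirmation points are supplied reflects the statement that the metric should agree with Search Metric 2 when no confirmation points are available; how BALFIT handles that case internally is not specified in the text.

```python
import numpy as np

MIN_CONFIRMATION_POINTS = 20   # threshold quoted in the text

def search_metric_3(X_data, y_data, X_conf=None, y_conf=None):
    """Greater of sigma_alpha (PRESS, data points) and sigma_beta (confirmation points)."""
    # Fit the tested math term combination using the data points only.
    coef, *_ = np.linalg.lstsq(X_data, y_data, rcond=None)

    # sigma_alpha: standard deviation of the PRESS residuals of the data points,
    # obtained from the leverage identity d'_i = d_i / (1 - h_ii).
    residuals = X_data @ coef - y_data
    Q, _ = np.linalg.qr(X_data)
    leverage = np.sum(Q**2, axis=1)
    sigma_alpha = float(np.std(residuals / (1.0 - leverage), ddof=1))

    # Assumed fallback: too few confirmation points, behave like Search Metric 2.
    if X_conf is None or y_conf is None or len(y_conf) < MIN_CONFIRMATION_POINTS:
        return sigma_alpha

    # sigma_beta: standard deviation of the response residuals of the
    # confirmation points, which played no part in the fit above.
    sigma_beta = float(np.std(X_conf @ coef - y_conf, ddof=1))
    return max(sigma_alpha, sigma_beta)
```

Because the returned value is the larger of the two standard deviations, a search that drives it down cannot improve the fit at the data points while the fit at the confirmation points deteriorates unnoticed.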

In the next section of the paper experimental data from a calibration of NASA’s MC60D wind tunnel 
strain-gage balance will be used (i) to demonstrate the application of the candidate math model search 
algorithm’s new search metric to a realistic data set and (ii) to compare characteristics of the new search 
metric with those of a previously applied search metric. 

IV. Application of New Search Metric to Wind Tunnel Balance Data 

A. Data Description 

A calibration data set of NASA’s MC60D strain-gage balance was selected for the present study (i) to 
demonstrate the application of Search Metric 3 to a realistic data set and (ii) to compare the optimization 
results for the previously used Search Metric 2 with the results for Search Metric 3. The use of a strain-gage 
balance needs to be explained in more detail in order to better understand the regression analysis results that 
are discussed in this section. A strain-gage balance is a measuring device that is used in wind tunnel testing. 
It makes the accurate measurement of a wind tunnel model’s aerodynamic loads possible. These aerodynamic 
forces and moments are obtained after processing the measured electrical outputs of the strain-gages of the 
balance using a multivariate regression model of the balance characteristics. This regression model is the 
result of the regression analysis of high-precision calibration data that relates a set of known balance loads 
to corresponding measured electrical outputs of the gages. The MC60D balance is a modern high capacity 
force type balance that was manufactured by Triumph (Force Measurement Systems) in 2008. The balance 
calibration data was obtained in Triumph’s ABCS calibration machine. This calibration machine makes it 
possible to obtain data for a complete description of the physical behavior of the balance. Six balance loads 
and electrical outputs of the gages were recorded for each point that was taken during the calibration. The 
final calibration data set consisted of 1906 individual loadings of the balance. An initial analysis of the data 
showed that the selected balance calibration data set is of excellent quality. 

For the present study it was decided to process the data in “direct read format.” Consequently, the
original calibration loads were converted from “force balance format” (forward normal force, aft normal
force, forward side force, aft side force, rolling moment, axial force) to “direct read format” (normal force,
pitching moment, side force, yawing moment, rolling moment, and axial force) using the load transformation
equations that are listed in Ref. [6]. The electrical outputs of the gages were already corrected for the tare
weight of the calibration machine hardware and the balance shell. Therefore, they could be used for the 
regression analysis without further modifications. 

B. Iterative versus Non-Iterative Methods 

In the aerospace testing community both iterative and non-iterative methods are used for the regression 
analysis of strain-gage balance calibration data. The iterative method uses the balance loads as “independent 
variables” and the electrical outputs of the gages as “dependent variables” in order to define the regressors 
and responses for the regression analysis. Consequently, the iterative method fits the electrical outputs of the 
gages as a function of the balance loads and then uses an iteration scheme in order to relate the measured 
electrical outputs to balance loads during a wind tunnel test (see Refs. [3], [4], and [7] for more detail). 
The non-iterative method, on the other hand, uses the electrical outputs of the gages as “independent 
variables” and the balance loads as “dependent variables” in order to define the regressors and responses for 
the regression analysis. This method is identical with a classical regression analysis of balance calibration 
data that directly fits each balance load as a function of the electrical outputs of the gages. 

The author’s studies showed that the iterative and the non-iterative method have the same balance load
prediction accuracy as long as (i) the balance calibration experiment is well designed and (ii) the regression
models of the calibration data meet strict statistical quality requirements. Therefore, in order to simplify
the discussion of the application of Search Metric 3 to the chosen experimental data set, it was decided (i) to
select the non-iterative method for the regression analysis and (ii) to optimize the regression model of only
the normal force component. Consequently, the normal force (NF) was treated as the “dependent variable,”
i.e., as the “response,” and the electrical outputs of the gages (R1, R2, ..., R6) were treated as “independent
variables,” i.e., as variables that define the “regressors” for the regression analysis.

C. Regression Analysis of Normal Force Data 

The regression analysis and regression model optimization of the normal force data of the balance was 
performed for two different data set examples. This choice made it possible to compare the predictive 
capability of two optimized regression models that are the result of using either Search Metric 2 or Search 
Metric 3. Table 1 below lists key features of the two selected examples that were processed using the
regression model optimization process. The threshold values for the primary and secondary search constraints
are listed as well. The optional math term hierarchy constraint was not applied.

Table 1: Description of Data Analysis Examples.

                                                    EXAMPLE 1          EXAMPLE 2
----------------------------------------------------------------------------------------
DEPENDENT VARIABLE (RESPONSE)                       NF                 NF
INDEPENDENT VARIABLES (DEFINE REGRESSORS)           R1, R2, ..., R6    R1, R2, ..., R6
NUMBER OF DATA POINTS (p)                           1906               201
NUMBER OF CONFIRMATION POINTS (q)                   0                  1705
SEARCH METRIC USED FOR OPTIMIZATION                 Search Metric 2    Search Metric 3
PRIMARY CONSTRAINT (P-VALUE OF T-STATISTIC)         < 0.0001           < 0.0001
SECONDARY CONSTRAINT (VARIANCE INFLATION FACTOR)    < 5                < 5
HIERARCHY CONSTRAINT (OPTIONAL)                     not applied        not applied
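
The secondary constraint in Table 1 bounds the variance inflation factor (VIF) of the model terms. The paper does not spell out how the VIF is computed, so the following Python sketch uses the textbook definition as an assumption: each regressor column is regressed on all other columns plus an intercept, and the VIF is 1 / (1 - R^2) of that auxiliary fit. NumPy, the function names, and the default threshold argument are illustrative choices.

```python
import numpy as np

def variance_inflation_factors(Z):
    """VIF of each column of Z (regressor columns, intercept column excluded)."""
    n, m = Z.shape
    vif = np.empty(m)
    for j in range(m):
        target = Z[:, j]
        # Auxiliary regression of column j on an intercept and all other columns.
        others = np.column_stack([np.ones(n), np.delete(Z, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        ss_res = np.sum((target - others @ coef) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        vif[j] = ss_tot / ss_res          # identical to 1 / (1 - R^2)
    return vif

def passes_secondary_constraint(Z, threshold=5.0):
    """True if no term shows a near-linear dependency (largest VIF below threshold)."""
    return bool(np.max(variance_inflation_factors(Z)) < threshold)
```

A term combination whose largest VIF reaches the threshold would be rejected before its search metric is even considered, which is how the constraint keeps near-linear dependencies out of the recommended math model.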


The first example, i.e., Example 1, was processed assuming that no confirmation points were available
for the regression model optimization. Therefore, all 1906 calibration points of the experimental data set 
were considered to be data points and only Search Metric 2 could be used during the optimization. 

The second example, i.e., Example 2, was constructed so that the new Search Metric 3 could be applied.
Therefore, the original experimental data set of 1906 calibration points was split into two subsets. The split
was performed by randomly selecting about 10% of the calibration points, i.e., 201 points, and assigning
them to be data points. Sufficient information about the normal force characteristics was contained in the
randomly chosen data points so that a meaningful regression analysis of the normal force data was possible.
The remaining 90% of the calibration points, i.e., 1705 points, were assigned to be confirmation points.
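
A random split of this kind can be sketched with the Python standard library alone. The function name and the fixed seed are assumptions made for reproducibility of the illustration; the paper does not state which random selection procedure was used.

```python
import random

def split_calibration_points(n_points, n_data, seed=0):
    """Randomly assign point indices to data points and confirmation points."""
    rng = random.Random(seed)
    data_idx = sorted(rng.sample(range(n_points), n_data))   # drawn without replacement
    chosen = set(data_idx)
    conf_idx = [i for i in range(n_points) if i not in chosen]
    return data_idx, conf_idx

data_idx, conf_idx = split_calibration_points(n_points=1906, n_data=201)
print(len(data_idx), len(conf_idx))   # 201 1705
```

A purely random draw does not by itself ensure that the data points cover the load envelope well enough for a meaningful fit, so in practice the selected subset should be checked for that, as the text notes for the present data set.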

A multivariate quadratic was chosen to be the upper bound of all math models that were investigated 
during the optimization of the two regression models. This upper bound has the following form: 

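$$
\begin{aligned}
NF \;=\;\; & a_1 + a_2 \cdot R1 + a_3 \cdot R2 + \cdots + a_7 \cdot R6 + a_8 \cdot (R1)^2 + a_9 \cdot (R2)^2 + \cdots \\
& + a_{13} \cdot (R6)^2 + a_{14} \cdot (R1 \cdot R2) + a_{15} \cdot (R1 \cdot R3) + \cdots + a_{28} \cdot (R5 \cdot R6)
\end{aligned}
\tag{6}
$$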
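
The 28 candidate terms of this upper bound (one intercept, six linear terms, six squared terms, and fifteen cross-product terms) can be enumerated programmatically. The sketch below is an illustration in plain Python; the function names and the tuple representation of a term are assumptions, and the ordering is chosen to match the term indices of the equation above.

```python
from itertools import combinations

def quadratic_terms(names):
    """Terms of a full multivariate quadratic, each encoded as a tuple of factors."""
    terms = [()]                                  # intercept (empty product)
    terms += [(n,) for n in names]                # linear terms
    terms += [(n, n) for n in names]              # squared terms
    terms += list(combinations(names, 2))         # cross-product terms
    return terms

def evaluate_terms(terms, point):
    """Evaluate every term at one point given as a mapping name -> value."""
    row = []
    for term in terms:
        value = 1.0
        for name in term:
            value *= point[name]
        row.append(value)
    return row

names = [f"R{k}" for k in range(1, 7)]
terms = quadratic_terms(names)
print(len(terms))    # 28
```

A candidate math model is then simply a subset of these terms, and evaluating the chosen subset at every data point yields the design matrix for the least squares fit.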

Figure 7a shows the independent variables, i.e., the electrical outputs of the strain-gages, that were
used for the regression model optimization of Example 1. A total of 1906 data points was used for
the calculation of the regression coefficients. Figure 7b shows the dependent variables (responses), i.e., the
normal forces, that were fitted during the regression model optimization of Example 1. Figure 7c depicts
Search Metric 2 for Example 1 as a function of the number of regression model terms. This metric was
minimized during the optimization. Figures 7d and 7e show the maximum p-value of the t-statistic (primary
search constraint) and the maximum variance inflation factor (secondary search constraint) as a function
of the number of regression model terms. The most conservative thresholds (< 0.0001 and < 5) were chosen
for the constraints during the optimization. Table 2 below shows the terms of the recommended math model that
was obtained for Example 1. This optimized regression model consists of a total of 24 terms.

Table 2: Recommended Math Model Terms for Example 1 and Example 2. 


INDEX   MATH TERM               EXAMPLE 1   EXAMPLE 2
------------------------------------------------------
1       intercept (constant)    X           X
2       R1                      X           X
3       R2                      X           X
4       R3                      X           X
5       R4                      X           X
6       R5                      X           X
7       R6                      X           X
8       (R1)^2                  X           X
9       (R2)^2                  X           X
10      (R3)^2                  X           -
11      (R4)^2                  -           -
12      (R5)^2                  X           X
13      (R6)^2                  X           -
14      (R1 · R2)               X           -
15      (R1 · R3)               -           -
16      (R1 · R4)               X           -
17      (R1 · R5)               X           X
18      (R1 · R6)               -           -
19      (R2 · R3)               X           -
20      (R2 · R4)               X           -
21      (R2 · R5)               X           X
22      (R2 · R6)               X           -
23      (R3 · R4)               X           -
24      (R3 · R5)               X           X
25      (R3 · R6)               X           -
26      (R4 · R5)               X           X
27      (R4 · R6)               -           -
28      (R5 · R6)               X           -

Figure 7f shows the analysis of variance results for the recommended math model of Example 1. The
analysis of variance results and other metrics depicted in Fig. 7f indicate that the recommended math
model met all statistical quality requirements that were imposed during the optimization process. Figure 7g
shows the response (normal force) residuals of the data points, i.e., the difference between the fitted and
applied normal force, as a function of the applied normal force. The standard deviation of the response
residuals is only a very small percentage (0.0646%) of the largest normal force magnitude that was applied
during the balance calibration. Therefore, the 24-term recommended math model of Example 1 is a good
characterization of the expected behavior of the normal force of the balance during a wind tunnel test.

Figure 8a shows the independent variables, i.e., the electrical outputs of the strain-gages, that were used
for the regression model optimization of Example 2. This time, only the 201 randomly selected data points
were used to calculate the regression coefficients. Figure 8b shows the dependent variables (responses), i.e.,
the normal forces, that were fitted during the regression model optimization of Example 2. Figure 8c depicts
Search Metric 3 for Example 2 as a function of the number of math terms. This metric was minimized during
the search. Figures 8d and 8e show the maximum p-value of the t-statistic (primary search constraint) and the
maximum variance inflation factor (secondary search constraint) as a function of the number of math terms
for Example 2. Table 2 above lists the terms of the recommended math model that was obtained for Example 2.
This time, the optimized regression model consists of only 14 terms, i.e., 10 fewer terms than the optimized
model of Example 1.

Figure 8f shows the analysis of variance results for the recommended math model of Example 2. The 
analysis of variance results and other metrics shown in Fig. 8f indicate that the recommended math model 
of Example 2 met all statistical quality requirements that were imposed during the optimization process. 
Figure 8g shows the response (normal force) residuals of the data points, i.e., the difference between the fitted 
and applied normal force, as a function of the applied normal force. The standard deviation of the response 
residuals of the data points is only a very small percentage (0.0718%) of the largest normal force magnitude 
that was applied during the balance calibration. Figure 8h shows the response (normal force) residuals of 
the confirmation points as a function of the applied normal force. Again, as the 1705 confirmation points 
were used during the optimization to independently test the predictive capability of the processed math term 
combinations, the standard deviation of the response residuals of the confirmation points is only a very small 
percentage (0.0765%) of the largest normal force magnitude that was applied during the balance calibration. 
The standard deviation of the response residuals of the data points and the standard deviation of the
response residuals of the confirmation points are of about the same magnitude, and both are very small. Therefore,
the 14-term recommended math model of Example 2 is also an excellent characterization of the expected
behavior of the normal force of the balance during a wind tunnel test.

The hierarchy rule was not explicitly applied during the optimization of the two regression models (see
Table 1). However, it is interesting to note that the optimized models listed in Table 2, Fig. 7f, and Fig. 8f
both turned out to be hierarchical. Therefore, it can be concluded that the original calibration data set of
the MC60D balance contains data that supports a hierarchical regression model.

A question remains: Which one of the two recommended math models of the calibration data is expected
to have better predictive capabilities? Several observations can be made after comparing the two
recommended math models.

Observation 1: Comparing the standard deviation of the response residuals of the 1906 data points that
were used to determine the recommended math model of Example 1 (0.0646%, see Fig. 7g) with the standard
deviation of the response residuals of the 201 data points that were used to determine the recommended
math model of Example 2 (0.0718%, see Fig. 8g), we see that the difference of the values is 0.0072%.

Observation 2: Comparing the standard deviation of the response residuals of the 1906 data points that
were used to determine the recommended math model of Example 1 (0.0646%, see Fig. 7g) with the standard
deviation of the response residuals of the 1705 confirmation points that were used to test the recommended
math model of Example 2 (0.0765%, see Fig. 8h), we see that the difference of the values is 0.0119%.

Observation 3: The recommended math model of Example 2 was obtained using only about 10% of the
data points that were required to obtain the recommended math model of Example 1.

Observation 4: The recommended math model of Example 2 uses only 14 of the 24 math terms that the
recommended math model of Example 1 uses.

Observation 5: The recommended math model of Example 1 was not tested at confirmation points that
are independent of the regression coefficients.

It is concluded, after reviewing the five observations, that the difference between the standard deviations
of the two recommended math models is relatively small (≈ 0.01% of the largest normal force magnitude).
This observation is remarkable considering the significant differences in (i) the number of data points that 
were used to develop the two regression models and (ii) the number of math terms of the two models. Both 
math models appear to have similar predictive capabilities. The recommended math model of Example 2, 
however, has the advantage that it was already successfully tested at regression coefficient independent 
confirmation points. Therefore, it appears that the recommended math model of Example 2 is the better 
and more reliable regression model for the given calibration data of the MC60D balance. 
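The arithmetic behind Observations 1 to 4 can be rechecked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the numbers are the values quoted in the text above.

```python
# Standard deviations quoted in the text, in percent of the largest
# normal force magnitude (see Figs. 7g, 8g, and 8h).
sd_ex1_data = 0.0646   # Example 1, 1906 data points
sd_ex2_data = 0.0718   # Example 2, 201 data points
sd_ex2_conf = 0.0765   # Example 2, 1706 confirmation points

diff_obs1 = round(sd_ex2_data - sd_ex1_data, 4)   # Observation 1
diff_obs2 = round(sd_ex2_conf - sd_ex1_data, 4)   # Observation 2
data_fraction = 201 / 1906                        # Observation 3
term_fraction = 14 / 24                           # Observation 4

print(f"Observation 1 difference: {diff_obs1:.4f}%")
print(f"Observation 2 difference: {diff_obs2:.4f}%")
print(f"Example 2 uses {data_fraction:.1%} of the data points "
      f"and {term_fraction:.1%} of the math terms of Example 1")
```

Both differences are on the order of 0.01% of the largest normal force magnitude, which is the basis for the conclusion drawn above.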

D. Tare Corrected Confirmation Point Loads 

Tare corrections are used in strain-gage balance calibration data analysis in order to correct the original 
balance calibration data for the effects of the weight of the calibration hardware and the balance shell. 
The manufacturer of the MC60D balance supplied the calibration data set in “tare corrected” format. 
Consequently, the data could directly be used for the regression analysis example that is discussed above. 
Often, however, balance calibration data is provided for analysis that still needs to be tare corrected. Different 
approaches are used in the aerospace testing community to estimate these tare corrections. One approach, 
the so-called “tare load iteration,” estimates tare corrections for the given balance calibration loads (see 
Ref. [3], pp. 13-14, or Ref. [7], p. 17 for a discussion of the tare load iteration process). Now a question must 
be asked: What impact do these tare corrections have on confirmation points that may be obtained after 
splitting an uncorrected balance calibration data set into a subset of data points and a subset of confirmation 
points? In general, tare corrections have to be computed for each balance calibration load series assuming 
that the weight of the calibration fixtures changes from load series to load series. Some points of a load series 
may belong to the subset of data points. Other points of the same load series may belong to the subset of 
confirmation points. All points of a load series, however, have the same tare load corrections. Therefore, the 
tare load corrections need to be estimated using the data points of the load series and afterwards applied to 
both the loads of the data points and the loads of the confirmation points before the calibration data set can 
be processed using the new search metric and the regression model optimization algorithm. 
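The bookkeeping described above may be sketched as follows. The function name `apply_series_tares` and the callable `estimate_tare` are hypothetical; `estimate_tare` stands in for whatever tare estimation procedure is used (e.g., the tare load iteration), which is not reproduced here. The tare is also assumed, purely for illustration, to act as an additive correction on a single load component.

```python
def apply_series_tares(loads, series_ids, is_data_point, estimate_tare):
    """Return tare corrected loads for data points and confirmation points.

    loads         : uncorrected calibration loads, one per point
    series_ids    : load series label of every point
    is_data_point : True for a data point, False for a confirmation point
    estimate_tare : hypothetical callable (series, fit_indices) -> tare load;
                    it only ever sees the data points of the load series
    """
    corrected = list(loads)
    for series in sorted(set(series_ids)):
        members = [i for i, s in enumerate(series_ids) if s == series]
        fit_indices = [i for i in members if is_data_point[i]]
        if not fit_indices:
            raise ValueError(f"load series {series!r} has no data points")
        # The tare is estimated from the data points of the series only ...
        tare = estimate_tare(series, fit_indices)
        # ... and then applied to every point of the series, including
        # its confirmation points.
        for i in members:
            corrected[i] = loads[i] + tare
    return corrected


# Toy usage with made-up numbers (not taken from the MC60D data set).
loads = [0.0, 100.0, 200.0, 0.0, 50.0]
series_ids = [1, 1, 1, 2, 2]
is_data_point = [True, False, True, True, False]
toy_tares = {1: 2.5, 2: 4.0}   # stand-in for a real tare estimate

corrected = apply_series_tares(
    loads, series_ids, is_data_point,
    lambda series, fit_indices: toy_tares[series],
)
print(corrected)
```

The design point is that the confirmation points never influence the tare estimate, so they remain independent of the regression coefficients, yet they receive the same correction as the data points of their load series.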

V. Summary and Conclusions 

A regression model optimization algorithm is currently being used at NASA Ames’ Wind Tunnel Division 
that identifies an optimized regression model, i.e., the so-called recommended math model, for a given 
multivariate experimental data set. This optimized regression model is designed (i) to meet strict statistical 
quality requirements and (ii) to prevent “overfitting” of the experimental data set’s responses. A new search 
metric was implemented in the optimization algorithm in 2009. This new search metric simultaneously 
uses data points and confirmation points in order to better assess and compare the predictive capability of 
different math term combinations that are tested during the optimization. 
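For a single candidate math term combination, the new search metric could be evaluated roughly as in the sketch below. Several assumptions are made that are not taken from the paper: the function name `new_search_metric` is illustrative, the regressor matrix of the data points is assumed to have full column rank, the PRESS residuals are obtained from the ordinary residuals and the leverages via the standard identity e_i / (1 - h_ii), and the standard deviations are computed as sample standard deviations (`ddof=1`). The synthetic numbers only exercise the function.

```python
import numpy as np


def new_search_metric(X_data, y_data, X_conf, y_conf):
    """Return (metric, sd_press, sd_conf) for one candidate math model.

    X_data, y_data : regressor matrix and responses of the data points
    X_conf, y_conf : regressor matrix and responses of the confirmation points
    """
    X_data = np.asarray(X_data, dtype=float)
    y_data = np.asarray(y_data, dtype=float)
    X_conf = np.asarray(X_conf, dtype=float)
    y_conf = np.asarray(y_conf, dtype=float)

    # Coefficients are determined from the data points only.
    coeffs, *_ = np.linalg.lstsq(X_data, y_data, rcond=None)

    # Response residuals and leverages (diagonal of the hat matrix).
    residuals = y_data - X_data @ coeffs
    Q, _ = np.linalg.qr(X_data)            # reduced QR, Q has shape (n, p)
    leverage = np.sum(Q ** 2, axis=1)

    # PRESS residuals of the data points.
    press = residuals / (1.0 - leverage)

    # Response residuals of the confirmation points.
    conf_residuals = y_conf - X_conf @ coeffs

    sd_press = float(np.std(press, ddof=1))
    sd_conf = float(np.std(conf_residuals, ddof=1))

    # The greater of the two standard deviations is the search metric.
    return max(sd_press, sd_conf), sd_press, sd_conf


# Synthetic usage example (made-up data, not the MC60D calibration).
rng = np.random.default_rng(0)
true_coeffs = np.array([1.0, 2.0, -0.5])
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ true_coeffs + 0.01 * rng.normal(size=30)
Xc = np.column_stack([np.ones(10), rng.normal(size=(10, 2))])
yc = Xc @ true_coeffs + 0.01 * rng.normal(size=10)

metric, sd_press, sd_conf = new_search_metric(X, y, Xc, yc)
print(metric, sd_press, sd_conf)
```

Because the metric is the maximum of the two values, a math term combination can only score well if it fits the data points and also predicts the confirmation points.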

Experimental data from a calibration of NASA’s MC60D wind tunnel strain-gage balance was used 
in the present study to illustrate the application of the new search metric during the regression model 
optimization of a realistic data set. The optimization results for the new search metric were compared with 
results that were obtained for a previously used search metric. This comparison showed that the new search 
metric has the ability to generate a math model that is expected to have better predictive capabilities than 
an optimized regression model that is obtained using a confirmation point independent search metric. 

It is interesting to note that all original points of a given experimental data set are used during the 
optimization of the regression model if the new search metric is applied. None of the information is discarded 
that may be contained in the experimental data set. Only a portion of the experimental data set, however, 
is needed to determine the coefficients of the regression model. The new search metric ultimately generates 
an optimized regression model of the experimental data set that was already tested at regression coefficient 
independent confirmation points before it is ever used to predict an unknown response from a set of regressors. 
Fig. 1 Key elements of candidate math model search algorithm. 
Fig. 2 “Ideal” regression analysis result of experimental data. 


American Institute of Aeronautics and Astronautics 

Fig. 3 “Overfitted” regression analysis result of experimental data. 
Fig. 4 Search metric options for a candidate math model search. 
Fig. 5 Determination of Search Metric 3. 



Fig. 6 Implementation of Search Metric 3 in candidate math model search algorithm. 
Fig. 7a Example 1: Electrical outputs of strain-gages versus data point index (1906 data points). 
Fig. 7b Example 1: Normal forces versus data point index (1906 data points). 
Fig. 7c Example 1: Search Metric 2 versus number of regression model terms. 
Fig. 7d Example 1: Primary search constraint versus the number of regression model terms. 
Fig. 7f Example 1: Analysis of variance results for the recommended math model with 24 terms. 
Fig. 7g Example 1: Response residuals of data points that were used to determine the recommended math model. 
Fig. 8a Example 2: Electrical outputs of strain-gages versus data point index (201 data points). 
Fig. 8b Example 2: Normal forces versus data point index (201 data points). 
Fig. 8c Example 2: Search Metric 3 versus number of regression model terms. 
Fig. 8d Example 2: Primary search constraint versus the number of regression model terms. 
Fig. 8e Example 2: Secondary search constraint versus the number of regression model terms. 
Fig. 8f Example 2: Analysis of variance results for the recommended math model with 14 terms. 
Fig. 8g Example 2: Response residuals of data points that were used to determine the recommended math model. 
Fig. 8h Example 2: Response residuals of confirmation points that were used to test the recommended math model. 